Finding Winner Alignments with Multiple Scoring Matrices

Authors

  • Kuo-Tsung Tseng
  • Chang-Biau Yang
  • Yung-Hsing Peng
  • Chiou-Ting Tseng
Abstract

Aligning two given sequences properly is a fundamental problem in bioinformatics. In sequence alignment, the factor that most directly affects the resulting alignment is the scoring matrix. A variety of scoring matrices are used for alignment, each with its own purpose in biosequence alignment. It is unlikely that a single alignment is optimal under every scoring matrix, but there may be one that achieves acceptable scores under the most matrices. In this paper, we present an efficient algorithm for finding the winner alignment when multiple scoring matrices are applied. We then discuss variants of the comparing function: by simply reordering the steps in the comparing function, we can obtain another winner alignment under a different criterion. Our algorithm solves the problem in O(kmn) time, where k is the number of scoring matrices involved and m and n are the lengths of the two input sequences. A more efficient algorithm with O(k(|Σ| + 1)^2 + mn) time can find the winner alignment under the summing scheme, where |Σ| denotes the alphabet size of the input sequences.
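The O(k(|Σ| + 1)^2 + mn) bound for the summing scheme can be illustrated with a minimal sketch. It assumes, as one plausible reading of the abstract rather than the authors' stated method, that the summing scheme ranks alignments by the sum of their scores over all k matrices and that gap penalties are linear and stored as an extra gap symbol in each matrix (hence the "+ 1"). Under those assumptions the k matrices can be added entrywise once, after which a single standard global-alignment dynamic program suffices. The names `make_matrix`, `sum_matrices`, and `global_alignment_score` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

GAP = "-"  # assumed gap symbol; it is the "+ 1" added to the alphabet size


def make_matrix(alphabet, match, mismatch, gap):
    """Build a toy scoring matrix as a dict keyed by symbol pairs (illustrative only)."""
    symbols = list(alphabet) + [GAP]
    matrix = {}
    for a, b in product(symbols, repeat=2):
        if a == GAP and b == GAP:
            continue  # a gap is never aligned with a gap
        if a == GAP or b == GAP:
            matrix[(a, b)] = gap
        elif a == b:
            matrix[(a, b)] = match
        else:
            matrix[(a, b)] = mismatch
    return matrix


def sum_matrices(matrices, alphabet):
    """Add k matrices entrywise: about k * (|alphabet| + 1)^2 additions."""
    symbols = list(alphabet) + [GAP]
    return {
        (a, b): sum(m[(a, b)] for m in matrices)
        for a, b in product(symbols, repeat=2)
        if (a, b) != (GAP, GAP)
    }


def global_alignment_score(s, t, score):
    """Standard global alignment DP with linear gap cost, O(mn) time."""
    n = len(t)
    # Row 0: the empty prefix of s aligned against prefixes of t (all gaps).
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        prev[j] = prev[j - 1] + score[(GAP, t[j - 1])]
    for i in range(1, len(s) + 1):
        curr = [prev[0] + score[(s[i - 1], GAP)]] + [0] * n
        for j in range(1, n + 1):
            curr[j] = max(
                prev[j - 1] + score[(s[i - 1], t[j - 1])],  # align two symbols
                prev[j] + score[(s[i - 1], GAP)],            # gap in t
                curr[j - 1] + score[(GAP, t[j - 1])],        # gap in s
            )
        prev = curr
    return prev[n]


if __name__ == "__main__":
    alphabet = "ACGT"
    m1 = make_matrix(alphabet, match=1, mismatch=-1, gap=-1)
    m2 = make_matrix(alphabet, match=2, mismatch=-1, gap=-2)
    combined = sum_matrices([m1, m2], alphabet)
    print(global_alignment_score("ACGT", "AGT", combined))
```

Because an alignment's score is linear in the matrix entries, its score under the summed matrix equals the sum of its scores under the individual matrices, which is why one dynamic program is enough in this reading. Other comparing functions, such as counting how many matrices an alignment satisfies, do not decompose this way, which is consistent with the general O(kmn) bound stated in the abstract.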

Similar resources

Global Pairwise Sequence Alignments with Multiple Scoring Matrices

The need to repeat alignments on the same pair of sequences in order to select an appropriate scoring function for detecting the significance of a score motivates us to work on a higher-performance matching technique. While overlapping the alignments obtained from a set of typical scoring matrices, each with its default gap cost, we observe significant parameters that may induce a maximum deviation from a referen...

Towards optimal alignment of protein structure distance matrices

MOTIVATION: Structural alignments of proteins are important for identification of structural similarities, homology detection and functional annotation. The structural alignment problem is well studied and computationally difficult. Many different scoring schemes for structural similarity as well as many algorithms for finding high-scoring alignments have been proposed. Algorithms using contact ...

Selecting the Right Similarity-Scoring Matrix.

Protein sequence similarity searching programs like BLASTP, SSEARCH (UNIT 3.10), and FASTA use scoring matrices that are designed to identify distant evolutionary relationships (BLOSUM62 for BLAST, BLOSUM50 for SSEARCH and FASTA). Different similarity scoring matrices are most effective at different evolutionary distances. "Deep" scoring matrices like BLOSUM62 and BLOSUM50 target alignments with...

Adjusting scoring matrices to correct overextended alignments

MOTIVATION: Sequence similarity searches performed with BLAST, SSEARCH and FASTA achieve high sensitivity by using scoring matrices (e.g. BLOSUM62) that target low identity (<33%) alignments. Although such scoring matrices can effectively identify distant homologs, they can also produce local alignments that extend beyond the homologous regions. RESULTS: We measured local alignment start/stop b...

Improved Sensitivity of Nucleic Acid Database Searches Using Application-Specific Scoring Matrices

Scoring matrices for nucleic acid sequence comparison that are based on models appropriate to the analysis of molecular sequencing errors or biological mutation processes are presented. In mammalian genomes, transition mutations occur significantly more frequently than transversions, and the optimal scoring of sequence alignments based on this substitution model differs from that derived assumi...

Publication date: 2008